1. Simple RAG
The basic retrieve-and-generate mechanism, suitable for prototyping and straightforward queries.
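A minimal sketch of the retrieve-and-generate loop. A toy bag-of-words "embedding" stands in for a real embedding model, and the corpus, `embed`, `retrieve`, and `build_prompt` names are all illustrative:

```python
from collections import Counter
import math

# Toy corpus; in practice these would be chunked documents with real embeddings.
DOCS = [
    "RAG combines retrieval with text generation.",
    "Vector databases store document embeddings.",
    "Transformers power most modern language models.",
]

def embed(text):
    # Hypothetical stand-in for an embedding model: bag-of-words counts.
    return Counter(t.strip(".,?!") for t in text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query):
    # Retrieved context is stuffed directly into the generation prompt.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt would then be sent to the LLM; everything before that call is the "retrieve" half of the pattern.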
2. RAG with Memory
Incorporates short- or long-term dialogue context and user history to enable more natural, multi-turn conversations.
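One common shape this takes is a sliding window of recent turns folded into each prompt; a minimal sketch (the `MemoryRAG` class and prompt layout are illustrative, and long-term memory would add a summarization or vector-store step on top):

```python
class MemoryRAG:
    """Keeps a sliding window of dialogue turns and folds it into each prompt."""

    def __init__(self, max_turns=3):
        self.history = []           # list of (role, text) pairs
        self.max_turns = max_turns  # short-term window size

    def record(self, role, text):
        self.history.append((role, text))

    def build_prompt(self, query, context):
        # Only the most recent turns are included, bounding prompt size.
        window = self.history[-2 * self.max_turns:]
        dialogue = "\n".join(f"{role}: {text}" for role, text in window)
        return (f"Context:\n{context}\n\n"
                f"Conversation so far:\n{dialogue}\n\n"
                f"user: {query}")
```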
3. Branched RAG
For complex questions, an LLM decomposes the user's query into multiple sub-questions, executes parallel retrieval, and synthesizes results.
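The decompose/parallel-retrieve/synthesize flow can be sketched as follows, with placeholder stubs where a real system would call an LLM and a retriever (the `decompose` heuristic and `retrieve` output format are assumptions for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def decompose(query):
    # Hypothetical stand-in for an LLM call that splits a complex query.
    return [part.strip() for part in query.split(" and ")]

def retrieve(sub_question):
    # Placeholder retriever; a real system would query a vector store here.
    return f"[docs for: {sub_question}]"

def branched_rag(query):
    subs = decompose(query)
    # Retrieval for each sub-question runs in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(retrieve, subs))
    # A final LLM call would synthesize an answer; here we just join the evidence.
    return "\n".join(results)
```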
4. HyDE (Hypothetical Document Embeddings)
An LLM generates a hypothetical answer, which is then embedded and used as the search vector, improving retrieval quality.
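The core trick is that the search vector comes from the hypothetical answer, not the raw query, since answer-shaped text tends to lie closer to relevant passages in embedding space. A toy sketch, using word-set overlap in place of real embeddings and a canned string in place of the LLM call:

```python
def embed(text):
    # Toy embedding: set of lowercased words (a real model returns a dense vector).
    return set(w.strip(".,'?") for w in text.lower().split())

def similarity(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0  # Jaccard overlap

def generate_hypothetical(query):
    # Placeholder for the LLM call that drafts a plausible, answer-shaped text.
    return "The capital of France is Paris, a city on the Seine."

def hyde_retrieve(query, docs):
    # Embed the hypothetical answer instead of the query.
    h = embed(generate_hypothetical(query))
    return max(docs, key=lambda d: similarity(h, embed(d)))

DOCS = [
    "Paris is the capital of France and sits on the Seine.",
    "Python is a widely used programming language.",
]
```

Even if the hypothetical answer contains factual errors, its vocabulary and phrasing usually steer retrieval toward the right passages.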
5. Adaptive RAG
Employs a routing layer to dynamically decide if retrieval is needed and what complexity of strategy is most appropriate, optimizing cost and efficiency.
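A routing layer can be as simple as a classifier that dispatches each query to a strategy; the keyword rules below are a crude placeholder for the trained classifier or LLM judge a production router would use:

```python
def route(query):
    # Hypothetical routing heuristic; real systems use a trained classifier
    # or an LLM judge instead of keyword rules.
    q = query.lower()
    if q.startswith(("hi", "hello", "thanks")):
        return "no_retrieval"   # chit-chat: answer directly, no search cost
    if " and " in q or "compare" in q:
        return "multi_step"     # complex query: decompose and branch
    return "simple"             # default: single retrieval pass

def adaptive_rag(query):
    handlers = {
        "no_retrieval": lambda q: f"direct answer to: {q}",
        "simple": lambda q: f"retrieve-then-answer: {q}",
        "multi_step": lambda q: f"decompose-and-branch: {q}",
    }
    return handlers[route(query)](query)
```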
6. Corrective RAG (CRAG)
Adds an evaluation step after retrieval; if confidence in the retrieved documents is low, the system reformulates the query or falls back to a web search for better information.
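The evaluate-then-correct step can be sketched as follows; the word-overlap `grade` is a crude stand-in for CRAG's trained retrieval evaluator, and `web_search` is a placeholder for a real search-API call:

```python
def grade(query, doc):
    # Stand-in for CRAG's retrieval evaluator: crude word-overlap score.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def web_search(query):
    # Placeholder for the corrective fallback (e.g. a search-API call).
    return f"[web results for: {query}]"

def corrective_rag(query, retrieved, threshold=0.5):
    best = max(retrieved, key=lambda d: grade(query, d))
    if grade(query, best) >= threshold:
        return best              # confident: use the retrieved passage
    return web_search(query)     # low confidence: correct via web search
```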
7. Self-RAG
The LLM itself generates "reflection tokens" to self-critique its reasoning and the relevance of retrieved context in real time.
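Downstream, those reflection tokens have to be parsed out of the model's output and acted on. A simplified sketch: the token names echo the Self-RAG paper's critique categories, but the bracketed format and filtering logic here are illustrative assumptions:

```python
import re

def parse_reflection(output):
    # Split model output into plain text plus bracketed reflection tokens,
    # e.g. "Paris is the capital. [Relevant] [Fully supported]" (assumed format).
    tokens = re.findall(r"\[([^\]]+)\]", output)
    text = re.sub(r"\s*\[[^\]]+\]", "", output).strip()
    return text, tokens

def accept(output):
    # Keep a candidate only if the model judged its retrieved context relevant
    # and its own claim supported by that context.
    _, tokens = parse_reflection(output)
    return "Relevant" in tokens and "Fully supported" in tokens
```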
8. Agentic RAG
A multi-agent workflow where an LLM orchestrates dynamic actions (search, API call, code execution) in a continuous loop for complex tasks.
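The continuous loop boils down to: ask the model for the next action, execute the matching tool, feed the observation back, repeat until it declares an answer. A minimal single-agent sketch with a hard-coded policy standing in for the orchestrating LLM:

```python
def agent_llm(state):
    # Placeholder policy; a real agent asks an LLM to pick the next action
    # given the question and everything observed so far.
    if "docs" not in state:
        return ("search", state["question"])
    return ("finish", f"answer based on {state['docs']}")

TOOLS = {
    "search": lambda q: f"[search results for: {q}]",
    # Real systems register more tools here: API calls, code execution, etc.
}

def agentic_rag(question, max_steps=5):
    state = {"question": question}
    for _ in range(max_steps):               # continuous observe-act loop
        action, arg = agent_llm(state)
        if action == "finish":
            return arg
        state["docs"] = TOOLS[action](arg)   # execute tool, record observation
    return "step budget exhausted"
```

The `max_steps` cap is the usual guard against an agent that never decides to finish.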
9. Multimodal RAG
Extends RAG beyond text to include visual and structured data, using vision-language models or image embeddings for richer insights.
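One simple arrangement is a single index whose items carry a modality tag and are embedded by a per-modality encoder. In the sketch below both "encoders" are word-set placeholders, and images are represented by their captions; a real system would use a CLIP-style vision-language model to put images and text in a shared embedding space:

```python
def embed_text(text):
    # Placeholder text embedder.
    return set(text.lower().split())

def embed_image(caption):
    # Placeholder for a vision-language model; here we pretend images
    # are represented by their captions.
    return set(caption.lower().split())

# Mixed-modality index: (modality, content) pairs.
INDEX = [
    ("text", "Quarterly revenue grew 12 percent"),
    ("image", "bar chart of quarterly revenue by region"),
]

def multimodal_retrieve(query, k=1):
    q = embed_text(query)
    def score(item):
        modality, content = item
        emb = embed_image(content) if modality == "image" else embed_text(content)
        return len(q & emb)
    return sorted(INDEX, key=score, reverse=True)[:k]
```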
10. Graph RAG
Builds a knowledge graph over documents, explicitly mapping entities and their relationships for multi-hop reasoning on complex questions.
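Multi-hop reasoning over such a graph amounts to finding a relation path between entities and handing that chain to the LLM as evidence. A minimal sketch with hand-written triples (extracting them from documents would itself be an LLM pass):

```python
from collections import defaultdict, deque

# Example (head, relation, tail) triples as extracted from documents.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Marie Curie", "won", "Nobel Prize"),
]

def build_graph(triples):
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph

def multi_hop(graph, start, target, max_hops=3):
    # Breadth-first search for the relation path linking two entities:
    # the chain a Graph RAG system would feed the LLM as evidence.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) < max_hops:
            for relation, nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [(node, relation, nxt)]))
    return None  # no path within the hop budget
```

Here "Where is Marie Curie's birthplace's country?" resolves via the two-hop chain born_in then capital_of, which flat chunk retrieval would struggle to connect.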